Rank | Count | Beginning |
---|---|---|
4928 | 2294 | Ta |
3302 | 1279 | She |
6592 | 959 | Ta'n |
8684 | 395 | T'eh |
9147 | 321 | Va |
119 | 283 | Ayns |
619 | 251 | Cha |
5490 | 227 | T'ad |
8474 | 205 | T'ee |
9361 | 192 | Va'n |
2830 | 180 | Ren |
4734 | 175 | 'Sy |
9736 | 166 | V'eh |
3011 | 140 | Rere |
2622 | 135 | Ny |
1362 | 134 | Er |
2386 | 129 | Myr |
1103 | 123 | Dy |
448 | 111 | By |
1557 | 102 | Foddee |
1817 | 98 | Haink |
1250 | 80 | Ec |
2006 | 65 | Hooar |
1942 | 55 | Hie |
2387 | 46 | Myrane |
4677 | 46 | Son |
9 | 44 | Agh |
1685 | 43 | Ga |
2107 | 40 | Hug |
2303 | 38 | Lurg |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
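For readers without access to the database, the same count can be sketched in plain Python. This is a minimal illustration, not the tooling used to produce the table: the function name `sentence_beginnings` and the sample strings are hypothetical, and the sentences are assumed to be available as a list of strings. The parameter `n` corresponds to N in the text, so the same function covers N=1, 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Mirrors substring_index(sentence, ' ', n): the text before the n-th
    space is kept, or the whole sentence if it has fewer than n spaces.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces only, as the SQL query does.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count, like "order by cnt desc".
    return counts.most_common(limit)


if __name__ == "__main__":
    # Toy input for illustration only; these are not corpus sentences.
    sample = ["Ta a b", "Ta c d", "She e f"]
    for beg, cnt in sentence_beginnings(sample, n=1):
        print(beg, cnt)
```

Like the query, the sketch splits on single spaces only, so punctuation attached to the first word stays part of the beginning.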